REMAP for video soundtrack indexing

Authors

  • Philippe Gelin
  • Christian Wellekens
Abstract

Indexing video soundtracks is an important issue for navigation in multimedia databases. Because it is based on word-spotting techniques, it must meet very demanding specifications: fast response to queries, concise processed speech information to limit storage, speaker-independent operation, and easy characterization of any word by its phonemic spelling. This paper proposes a solution based on phonemic lattices and on a division of the indexing process into an off-line part and an on-line part. Previous work [1][2], based on frame labelling and a Maximum Likelihood criterion, is modified to take into account this new approach, which is based on a Maximum a Posteriori (MAP) criterion. The REMAP algorithm [3] implements this MAP criterion for training. It has several advantages: it maximizes the global discriminant criterion, avoids the difficult problem of detecting phoneme transitions during training, and is well suited to a hybrid Hidden Markov Model (HMM) and Neural Network (NN) approach.
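
The off-line/on-line split over a phonemic lattice can be illustrated with a minimal sketch. It is not the authors' implementation: the `Edge` structure, the `build_index` and `spot` helpers, the phoneme symbols, and the log-posterior values are all hypothetical, and neither REMAP training nor the hybrid HMM/NN scoring is shown. The sketch only assumes that the off-line pass has already produced phoneme hypotheses with frame boundaries and scores, and that a query arrives as a phonemic spelling.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    """One phoneme hypothesis in a toy lattice (hypothetical structure)."""
    start: int       # first frame covered by the hypothesis
    end: int         # frame at which the hypothesis ends (exclusive)
    phoneme: str
    log_post: float  # made-up log posterior, for illustration only


def build_index(edges):
    """Off-line step: group lattice edges by (phoneme, start frame)."""
    index = defaultdict(list)
    for edge in edges:
        index[(edge.phoneme, edge.start)].append(edge)
    return index


def spot(index, spelling):
    """On-line step: find connected edge paths that spell the query.

    Returns (start_frame, end_frame, total_log_post) tuples, best first.
    """
    if not spelling:
        return []
    hits = []
    starts = sorted({start for (ph, start) in index if ph == spelling[0]})
    for s in starts:
        # Each partial path is (frame reached so far, accumulated score).
        partial = [(s, 0.0)]
        for ph in spelling:
            partial = [
                (edge.end, score + edge.log_post)
                for frame, score in partial
                for edge in index.get((ph, frame), [])
            ]
        hits.extend((s, end, score) for end, score in partial)
    return sorted(hits, key=lambda h: h[2], reverse=True)


# Toy lattice with two competing vowel hypotheses after "k".
toy_edges = [
    Edge(0, 3, "k", -0.2),
    Edge(3, 6, "ae", -0.5),
    Edge(3, 7, "ah", -0.9),
    Edge(6, 9, "t", -0.1),
    Edge(7, 9, "t", -0.4),
    Edge(9, 12, "s", -0.3),
]
toy_index = build_index(toy_edges)
cat_hits = spot(toy_index, ["k", "ae", "t"])
```

In this sketch the costly work (producing and indexing the lattice) happens once per soundtrack, while each query only walks the stored index, which is the property the fast-response and compact-storage requirements call for. Any word can be queried as long as its phonemic spelling is available, without re-processing the audio.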


Similar resources

On the Applicability of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video Soundtracks

A video's soundtrack is usually highly correlated to its content. Hence, audio-based techniques have recently emerged as a means for video concept detection complementary to visual analysis. Most state-of-the-art approaches rely on manual definition of predefined sound concepts such as “engine sounds”, “outdoor/indoor sounds”. These approaches come with three major drawbacks: manual definitions...


Characterizing Audio Events for Video Soundtrack Analysis


Keyword spotting enhancement for video soundtrack indexing

Multimedia databases contain an increasing number of videos that are hard to access semantically. Among the useful indices that can be extracted from the soundtrack, the presence of a keyword at some place plays a prominent role. This paper deals with the specificities of such a keyword spotter and the enhancement brought to our previous technique [1], based on frame labeling. To be useful, s...


Prototyping The VISION Digital Video Library System

The digital libraries of the future will provide electronic access to information in many different forms. Recent technological advances make the storage and transmission of digital video information possible. This project is to design and implement a digital video library system prototype suitable for storage, indexing, and retrieving video and audio information and providing that information acros...


MusiClef 2013: Soundtrack Selection for Commercials

MusiClef was one of the “brave new tasks” at MediaEval 2013 with a multimodal approach that combined music, video and textual information in order to evaluate systems that recommend a music soundtrack given the video of a commercial and the information on the product to be advertised.



Publication date: 1997